Smart Paper Notes

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Translated by Google Translate

FETA: Towards Specializing Foundation Models for Expert Task Applications

Amit Alfassy , Assaf Arbelle , Oshri Halimi , Sivan Harary , Roei Herzig , Eli Schwartz , Rameswar Panda , Michele Dolfi , Christoph Auer , Kate Saenko

Category: Computer Vision

2022-09-08

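Foundation models (FMs) have demonstrated unprecedented capabilities, including zero-shot learning, high-fidelity data synthesis, and out-of-domain generalization. However, as we show in this paper, FMs still perform poorly out of the box on expert tasks (e.g., retrieving technical illustrations in car manuals from language queries), for which the data are either unseen or belong to a long-tail part of the data distribution of the huge datasets used for FM pre-training. This underlines the necessity of explicitly evaluating and fine-tuning FMs on such expert tasks, which are arguably the ones that matter most in practical real-world applications. In this paper, we propose the FETA benchmark, built around the task of teaching FMs to understand technical documentation by learning to match its graphical illustrations to the corresponding language descriptions. Our FETA benchmark focuses on text-to-image and image-to-text retrieval in public car manuals and sales catalogue brochures. FETA is equipped with a procedure for completely automatic annotation extraction (code will be released upon acceptance), which makes it easy to extend FETA to more documentation types and application domains in the future. Our automatic annotation yields an automated performance metric that is shown to be consistent with metrics computed on human-curated annotations (also released). We provide multiple baselines and an analysis of popular FMs on FETA, leading to several interesting findings that we believe will be very valuable to the FM community, paving the way towards real-world application of FMs to practical expert tasks that are currently "overlooked" by standard benchmarks that focus on common objects.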


Unsupervised Domain Generalization by Learning a Bridge Across Domains

Sivan Harary , Eli Schwartz , Assaf Arbelle , Peter Staar , Shady Abu-Hussein , Elad Amrani , Roei Herzig , Amit Alfassy , Raja Giryes , Hilde Kuehne

Category: Computer Vision

2021-12-04

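The ability to generalize learned representations across significantly different visual domains, such as between real photos, clipart, paintings, and sketches, is a fundamental capacity of the human visual system. In this paper, unlike most cross-domain works, which use some (or full) source-domain supervision, we approach a relatively new and very practical Unsupervised Domain Generalization (UDG) setup in which there is no training supervision in either the source or the target domains. Our approach is based on self-supervised learning of a Bridge Across Domains (BrAD): an auxiliary bridge domain accompanied by a set of semantics-preserving visual (image-to-image) mappings to BrAD from each of the training domains. The BrAD and the mappings to it are learned jointly (end-to-end) with a contrastive self-supervised representation model that semantically aligns each domain to its BrAD projection, and hence implicitly drives all the domains (seen or unseen) to align semantically with each other. In this work, we show how, using an edge-regularized BrAD, our approach achieves significant gains across multiple benchmarks and a range of tasks, including UDG, few-shot UDA, and unsupervised generalization across multi-domain datasets (including generalization to unseen domains and classes).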


Cross-Domain Video Anomaly Detection without Target Domain Adaptation

Abhishek Aich , Kuan-Chuan Peng , Amit K. Roy-Chowdhury

Category: Computer Vision

2022-12-14

Most cross-domain unsupervised Video Anomaly Detection (VAD) works assume that at least a few task-relevant target domain training data are available for adaptation from the source to the target domain. However, this requires laborious model-tuning by the end-user, who may prefer to have a system that works "out-of-the-box." To address such practical scenarios, we identify a novel target domain (inference-time) VAD task where no target domain training data are available. To this end, we propose a new "Zero-shot Cross-domain Video Anomaly Detection (zxvad)" framework that includes a future-frame prediction generative model setup. Different from prior future-frame prediction models, our model uses a novel Normalcy Classifier module to learn the features of normal event videos by learning how such features differ "relatively" from features in pseudo-abnormal examples. A novel Untrained Convolutional Neural Network based Anomaly Synthesis module crafts these pseudo-abnormal examples by adding foreign objects to normal video frames with no extra training cost. With our novel relative normalcy feature learning strategy, zxvad generalizes and learns to distinguish between normal and abnormal frames in a new target domain without adaptation during inference. Through evaluations on common datasets, we show that zxvad outperforms the state-of-the-art (SOTA), regardless of whether task-relevant (i.e., VAD) source training data are available or not. Lastly, zxvad also beats the SOTA methods in inference-time efficiency metrics including the model size, total parameters, GPU energy consumption, and GMACs.


A Neural ODE Interpretation of Transformer Layers

Yaofeng Desmond Zhong , Tongtao Zhang , Amit Chakraborty , Biswadip Dey

Category: Machine Learning | Artificial Intelligence

2022-12-12

Transformer layers, which use an alternating pattern of multi-head attention and multi-layer perceptron (MLP) layers, provide an effective tool for a variety of machine learning problems. As the transformer layers use residual connections to avoid the problem of vanishing gradients, they can be viewed as the numerical integration of a differential equation. In this extended abstract, we build upon this connection and propose a modification of the internal architecture of a transformer layer. The proposed model places the multi-head attention sublayer and the MLP sublayer parallel to each other. Our experiments show that this simple modification improves the performance of transformer networks in multiple tasks. Moreover, for the image classification task, we show that using neural ODE solvers with a sophisticated integration scheme further improves performance.
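The difference between the usual sequential ordering and the parallel placement of the two sublayers can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: `toy_attn` and `toy_mlp` are hypothetical stand-ins for the real sublayers, and the vectors are plain Python lists. With residual connections, the parallel block reads as a single forward-Euler step (step size 1) of dx/dt = attn(x) + mlp(x).

```python
from typing import Callable, List

Vector = List[float]
Sublayer = Callable[[Vector], Vector]


def add(x: Vector, y: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(x, y)]


def sequential_block(x: Vector, attn: Sublayer, mlp: Sublayer) -> Vector:
    """Standard ordering: the MLP reads the output of the attention residual step."""
    h = add(x, attn(x))
    return add(h, mlp(h))


def parallel_block(x: Vector, attn: Sublayer, mlp: Sublayer) -> Vector:
    """Parallel ordering: both sublayers read the same input x.

    Equivalent to one forward-Euler step of dx/dt = attn(x) + mlp(x).
    """
    return add(x, add(attn(x), mlp(x)))


def toy_attn(x: Vector) -> Vector:
    """Hypothetical stand-in for attention: pulls each entry towards the mean."""
    mean = sum(x) / len(x)
    return [mean - v for v in x]


def toy_mlp(x: Vector) -> Vector:
    """Hypothetical stand-in for an MLP: a scaled ReLU."""
    return [0.5 * max(v, 0.0) for v in x]


if __name__ == "__main__":
    x = [1.0, -1.0]
    print("sequential:", sequential_block(x, toy_attn, toy_mlp))
    print("parallel:  ", parallel_block(x, toy_attn, toy_mlp))
```

The two blocks generally give different outputs for the same input, because in the sequential form the second sublayer sees an already updated state.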


DeepCut: Unsupervised Segmentation using Graph Neural Networks Clustering

Amit Aflalo , Shai Bagon , Tamar Kashti , Yonina Eldar

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-12-12

Image segmentation is a fundamental task in computer vision. Data annotation for training supervised methods can be labor-intensive, motivating unsupervised methods. Some existing approaches extract deep features from pre-trained networks and build a graph to apply classical clustering methods (e.g., $k$-means and normalized-cuts) as a post-processing stage. These techniques reduce the high-dimensional information encoded in the features to pair-wise scalar affinities. In this work, we replace classical clustering algorithms with a lightweight Graph Neural Network (GNN) trained to achieve the same clustering objective function. However, in contrast to existing approaches, we feed the GNN not only the pair-wise affinities between local image features but also the raw features themselves. Maintaining this connection between the raw features and the clustering goal allows us to perform part semantic segmentation implicitly, without requiring additional post-processing steps. We demonstrate how classical clustering objectives can be formulated as self-supervised loss functions for training our image segmentation GNN. Additionally, we use the Correlation-Clustering (CC) objective to perform clustering without defining the number of clusters ($k$-less clustering). We apply the proposed method to object localization, segmentation, and semantic part segmentation tasks, surpassing state-of-the-art performance on multiple benchmarks.
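To make the notion of a classical clustering objective concrete, the sketch below evaluates the textbook two-way normalized-cut value, cut(A, B)/vol(A) + cut(A, B)/vol(B), for a hard partition of a small affinity graph. It is not the loss used in the paper, which trains a GNN on a differentiable objective; the function name `normalized_cut` and the toy affinity matrix are assumptions made purely for illustration.

```python
from typing import List, Sequence


def normalized_cut(affinity: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Two-way normalized cut of a symmetric affinity matrix.

    labels[i] must be 0 or 1. Lower values indicate a better partition.
    """
    n = len(affinity)
    degree = [sum(row) for row in affinity]

    # Volume of each side: sum of node degrees on that side.
    vol = [0.0, 0.0]
    for i in range(n):
        vol[labels[i]] += degree[i]
    if vol[0] == 0.0 or vol[1] == 0.0:
        raise ValueError("each side of the partition needs nonzero volume")

    # Total affinity crossing from side 0 to side 1.
    cut = sum(
        affinity[i][j]
        for i in range(n)
        for j in range(n)
        if labels[i] == 0 and labels[j] == 1
    )
    return cut / vol[0] + cut / vol[1]


# Toy graph (an assumption): two tightly linked pairs joined by one weak edge.
AFFINITY: List[List[float]] = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    print("split along the weak edge:", normalized_cut(AFFINITY, [0, 0, 1, 1]))
    print("split across strong edges:", normalized_cut(AFFINITY, [0, 1, 0, 1]))
```

Cutting along the weak edge scores far lower than cutting through the strong edges, which is the behavior a clustering loss of this family rewards.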


Detection Selection Algorithm: A Likelihood based Optimization Method to Perform Post Processing for Object Detection

Angzhi Fan , Benjamin Ticknor , Yali Amit

Category: Computer Vision

2022-12-12

In object detection, post-processing methods like Non-maximum Suppression (NMS) are widely used. NMS can substantially reduce the number of false positive detections but may still keep some detections with low objectness scores. In order to find the exact number of objects and their labels in the image, we propose a post-processing method called Detection Selection Algorithm (DSA), which is used after NMS or related methods. DSA greedily selects a subset of detected bounding boxes, together with full object reconstructions that give the interpretation of the whole image with the highest likelihood, taking into account object occlusions. The algorithm consists of four components. First, we add an occlusion branch to Faster R-CNN to obtain occlusion relationships between objects. Second, we develop a single reconstruction algorithm which can reconstruct the whole appearance of an object given its visible part, based on the optimization of latent variables of a trained generative network which we call the decoder. Third, we propose a whole reconstruction algorithm which generates the joint reconstruction of all objects in a hypothesized interpretation, taking into account occlusion ordering. Finally, we propose a greedy algorithm that incrementally adds or removes detections from a list to maximize the likelihood of the corresponding interpretation. DSA with NMS or Soft-NMS can achieve better results than NMS or Soft-NMS themselves, as is illustrated in our experiments on synthetic images with multiple 3D objects.
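For context, the standard greedy NMS step that DSA is applied after can be sketched as follows. This is the textbook procedure, not the DSA algorithm itself; the box format `(x1, y1, x2, y2)`, the function names, and the 0.5 threshold are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.3]
    print(nms(boxes, scores))
```

In this toy input the second box is suppressed as a near-duplicate of the first, while the isolated low-scoring third box survives; that kind of leftover detection is what a subsequent selection step would have to accept or reject.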


YolOOD: Utilizing Object Detection Concepts for Out-of-Distribution Detection

Alon Zolfi , Guy Amit , Amit Baras , Satoru Koda , Ikuya Morikawa , Yuval Elovici , Asaf Shabtai

Category: Computer Vision | Machine Learning

2022-12-05

Out-of-distribution (OOD) detection has attracted a large amount of attention from the machine learning research community in recent years due to its importance in deployed systems. Most of the previous studies focused on the detection of OOD samples in the multi-class classification task. However, OOD detection in the multi-label classification task remains an underexplored domain. In this research, we propose YolOOD - a method that utilizes concepts from the object detection domain to perform OOD detection in the multi-label classification task. Object detection models have an inherent ability to distinguish between objects of interest (in-distribution) and irrelevant objects (e.g., OOD objects) on images that contain multiple objects from different categories. These abilities allow us to convert a regular object detection model into an image classifier with inherent OOD detection capabilities with just minor changes. We compare our approach to state-of-the-art OOD detection methods and demonstrate YolOOD's ability to outperform these methods on a comprehensive suite of in-distribution and OOD benchmark datasets.


Sequential parametrized motion planning and its complexity, II

Michael Farber , Amit Kumar Paul

Category: Robotics

2022-12-02

This is a continuation of our recent paper in which we developed the theory of sequential parametrized motion planning. A sequential parametrized motion planning algorithm produces a motion of the system which is required to visit a prescribed sequence of states, in a certain order, at specified moments of time. In the previous publication we analysed the sequential parametrized topological complexity of the Fadell-Neuwirth fibration, which is relevant to the problem of moving multiple robots while avoiding collisions with other robots and with obstacles in the Euclidean space. Besides, in the preceding paper we found the sequential parametrized topological complexity of the Fadell-Neuwirth bundle for the case of the Euclidean space $\Bbb R^d$ of odd dimension as well as the case $d=2$. In the present paper we give the complete answer for an arbitrary even $d\ge 2$. Moreover, we present an explicit motion planning algorithm for controlling multiple robots in $\Bbb R^d$ having the minimal possible topological complexity; this algorithm is applicable to any number $n$ of robots and any number $m\ge 2$ of obstacles.


When Neural Networks Fail to Generalize? A Model Sensitivity Perspective

Jiajin Zhang , Hanqing Chao , Amit Dhurandhar , Pin-Yu Chen , Ali Tajer , Yangyang Xu , Pingkun Yan

Category: Computer Vision | Artificial Intelligence

2022-12-01

Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions. This paper considers a more realistic yet more challenging scenario, namely Single Domain Generalization (Single-DG), where only a single source domain is available for training. To tackle this challenge, we first try to understand when neural networks fail to generalize. We empirically ascertain a property of a model that correlates strongly with its generalization, which we term "model sensitivity". Based on our analysis, we propose a novel strategy of Spectral Adversarial Data Augmentation (SADA) to generate augmented images targeted at the highly sensitive frequencies. Models trained with these hard-to-learn samples can effectively suppress the sensitivity in the frequency space, which leads to improved generalization performance. Extensive experiments on multiple public datasets demonstrate the superiority of our approach, which surpasses the state-of-the-art single-DG methods.
